ELRA Validation Methodology and Standard Promotion for Linguistic Resources

Authors

  • Hanne Fersøe
  • Monica Monachini
Abstract

This paper describes the results of work carried out for ELRA during 2003-2004. It presents the methodology for validating written language resources (WLRs), specifically lexica, which was developed for ELRA and tested on a few resources in the ELRA catalogue. It discusses key issues in lexicon creation and validation, such as the adoption of standards for coding linguistic content and the importance of documentation. It reports on the experience gained from applying the methodology to lexical resources in the ELRA catalogue, arguing that the checks must be reasonable, informative, at a suitable level of detail, and generic. It proposes a set of basic elements to be included in future discussions on establishing standards for lexicon resources. In conclusion, it sketches the work to be undertaken in 2004 to promote validation and the adoption of standards.

Related resources

Quality control of language resources at ELRA

To promote quality control of its language resources the European Language Resources Association (ELRA) installed a Validation Committee. This paper presents an overview of current activities of the Committee: validation of language resources, standardisation, bug reporting, patches of updates of language resources, and dissemination of results.

Unified Lexicon and Unified Morphosyntactic Specifications for Written and Spoken Italian

The goal of this paper is (1) to illustrate a specific procedure for merging different monolingual lexicons, focusing on techniques for detecting and mapping equivalent lexical entries, and (2) to sketch a production model that enables one to obtain lexical resources via unification of existing data. We describe the creation of a Unified Lexicon (UL) from a common sample of the Italian PAROLE/S...

The Workshop Programme

I will talk about core issues in quality control, such as how we define quality in the case of language resources, how much variation there is in the definition, and what this means for implementing quality control procedures. I think this is important because I have seen many publications that seem to take the approach that quality is a single dimension and that our primary task is to move ourselv...

ELRA contribution to bridge the gap between industry and academia

The European Language Resources Association (ELRA) was created in February 1995 to handle all issues related to Language Resources. ELRA’s missions and activities include the collection, distribution, and validation of speech, text, and terminology resources and tools. Very recently ELRA has launched a new activity regarding the evaluation (of technologies, systems, prototypes, services, etc.). After fi...

Recent Developments within the European Language Resources Association (ELRA)

The main achievement of ELRA (the most visible) is the growth of its catalogue. The ELRA catalogue as of April 2000 lists 111 speech resources, 50 monolingual lexica, 113 multilingual lexica, 24 written corpora and 275 terminological databases. However, many Language Resources (LRs) need to be identified and/or produced. To this effect, ELRA is active in promoting and funding the co-production ...

Publication date: 2004